A Stochastic Composite Augmented Lagrangian Method for Reinforcement Learning

Abstract

In this paper, we consider the linear programming (LP) formulation for deep reinforcement learning. The number of constraints depends on the size of the state and action spaces, which makes the problem intractable in large or continuous environments. The general augmented Lagrangian method suffers from the double-sampling obstacle in solving the linear program. Motivated by the updates of the multipliers, we overcome the obstacles in minimizing the augmented Lagrangian function by replacing the conditional expectations with the multipliers. Therefore, a deep parameterized augmented Lagrangian method is proposed. The replacement provides a promising breakthrough to integrate the two steps of the augmented Lagrangian method into a single quadratic penalty problem. A theoretical analysis shows that the solutions generated from a sequence of constrained optimization problems converge to the optimal solution of the linear program if the error is controlled properly. An analysis of the algorithm without using target networks, under the neural tangent kernel setting, shows that the residual can be arbitrarily small if the parameter of the network is chosen suitably. Preliminary experiments illustrate that our method is competitive with other state-of-the-art algorithms.
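The double-sampling obstacle mentioned in the abstract can be illustrated with a small sketch. It assumes the obstacle is the standard one: a quadratic penalty squares a conditional expectation over next states, and squaring a single sampled residual estimates the wrong quantity. The one-step transition, the values, and the function names below are hypothetical and are not taken from the paper. Expectations are computed exactly by enumeration, so the result is deterministic.

```python
# Hypothetical one-step setting: from a fixed (state, action) pair the next
# state is drawn from `probs`, and `next_values` holds V(s') for each outcome.

def bellman_residual(v_s, reward, gamma, v_next):
    """Residual r + gamma * V(s') - V(s) for one sampled next state."""
    return reward + gamma * v_next - v_s


def squared_residual_estimators(v_s, reward, gamma, next_values, probs):
    """Return (target, single, double), all computed by exact enumeration.

    target: (E[residual])^2, the quantity a quadratic penalty needs.
    single: E[residual^2], the mean of squaring one sampled residual.
    double: E[residual_1 * residual_2] for two independent next states.
    """
    residuals = [bellman_residual(v_s, reward, gamma, v) for v in next_values]
    mean_residual = sum(p * d for p, d in zip(probs, residuals))
    target = mean_residual ** 2
    single = sum(p * d * d for p, d in zip(probs, residuals))
    double = sum(
        p1 * p2 * d1 * d2
        for p1, d1 in zip(probs, residuals)
        for p2, d2 in zip(probs, residuals)
    )
    return target, single, double


target, single, double = squared_residual_estimators(
    v_s=1.0, reward=0.0, gamma=0.5, next_values=[0.0, 4.0], probs=[0.5, 0.5]
)
print(target, single, double)  # prints: 0.0 1.0 0.0
```

In this example the single-sample estimator exceeds the target by the variance of the residual, while the product of two independent samples matches it. Drawing two independent next states from the same state-action pair is usually impossible outside a simulator, which is why a method that avoids the squared conditional expectation is attractive.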

Related resources

The proximal augmented Lagrangian method for nonsmooth composite optimization

We study a class of optimization problems in which the objective function is given by the sum of a differentiable but possibly nonconvex component and a nondifferentiable convex regularization term. We introduce an auxiliary variable to separate the objective function components and utilize the Moreau envelope of the regularization term to derive the proximal augmented Lagrangian – a continuous...

Augmented Lagrangian Filter Method

We introduce a filter mechanism to force convergence for augmented Lagrangian methods for nonlinear programming. In contrast to traditional augmented Lagrangian methods, our approach does not require the use of forcing sequences that drive the first-order error to zero. Instead, we employ a filter to drive the optimality measures to zero. Our algorithm is flexible in the sense that it allows fo...

Solving Environmental/Economic Power Dispatch Problem by a Trust Region Based Augmented Lagrangian Method

This paper proposes a Trust-Region Based Augmented Lagrangian Method (TRALM) to solve a combined Environmental and Economic Power Dispatch (EEPD) problem. The EEPD problem is a multi-objective problem with competing and non-commensurable objectives. The TRALM produces a set of non-dominated Pareto optimal solutions for the problem. Fuzzy set theory is employed to extract a compromise non-dominated sol...

An augmented Lagrangian method for distributed optimization

We propose a novel distributed method for convex optimization problems with a certain separability structure. The method is based on the augmented Lagrangian framework. We analyze its convergence and provide an application to two network models, as well as to a two-stage stochastic optimization problem. The proposed method compares favorably to two augmented Lagrangian decomposition methods kno...

PENNON: A Generalized Augmented Lagrangian Method for Semidefinite Programming

This article describes a generalization of the PBM method by Ben-Tal and Zibulevsky to convex semidefinite programming problems. The algorithm used is a generalized version of the Augmented Lagrangian method. We present details of this algorithm as implemented in a new code PENNON. The code can also solve second-order conic programming (SOCP) problems, as well as problems with a mixture of SDP,...

Journal

Journal title: SIAM Journal on Optimization

Year: 2023

ISSN: 1095-7189, 1052-6234

DOI: https://doi.org/10.1137/21m1421726